Variational Bayes approach for model aggregation in unsupervised classification with Markovian dependency

Authors

  • Stevenn Volant
  • Marie-Laure Martin-Magniette
  • Stéphane Robin
Abstract

We consider a binary unsupervised classification problem where each observation is associated with an unobserved label that we want to retrieve. More precisely, we assume that there are two groups of observations: normal and abnormal. The ‘normal’ observations come from a known distribution, whereas the distribution of the ‘abnormal’ observations is unknown. Several models have been developed to fit this unknown distribution. In this paper, we propose an alternative based on a mixture of Gaussian distributions. The inference is carried out within a variational Bayesian framework, and our aim is to infer the posterior probability of belonging to the class of interest. To this end, it makes little sense to estimate the number of mixture components, since each mixture model provides more or less relevant information for the estimation of the posterior probability. Bayesian Model Averaging (BMA) is one way of combining models to account for the information provided by each of them: it computes a weighted average (named the aggregated estimator) over the model collection. The problem then reduces to estimating the weights and the posterior probability under each specific model. In this work, we derive optimal approximations of these quantities from variational theory and propose alternative approximations of the weights. Because the data are assumed to be dependent (Markovian dependency), we work within a hidden Markov model framework. A simulation study is carried out to evaluate the accuracy of the estimates in terms of classification. We also present an application to the analysis of public health surveillance systems.
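The abstract describes combining the posterior probabilities produced by a collection of candidate models through a weighted average, with weights derived from variational approximations of each model's evidence. The sketch below is a minimal illustration of that aggregation step only, not the authors' implementation: the per-model posteriors `tau`, the evidence proxies `elbo`, and the function `aggregate_posteriors` are illustrative assumptions, and the variational inference step that would produce them is not shown.

```python
# Minimal sketch (illustrative, not the authors' method): Bayesian Model
# Averaging of per-model posterior probabilities of the "abnormal" class.
# tau[k, i] is assumed to be P(observation i is abnormal) under model k,
# and elbo[k] a variational lower bound used as a log-evidence proxy.
import numpy as np

def aggregate_posteriors(tau, elbo):
    """Return the aggregated posterior (shape (n,)) and the model weights (shape (K,))."""
    tau = np.asarray(tau, dtype=float)    # shape (K, n)
    elbo = np.asarray(elbo, dtype=float)  # shape (K,)
    # Softmax of the evidence proxies gives approximate posterior model weights.
    w = np.exp(elbo - elbo.max())
    w /= w.sum()
    # Weighted average over the model collection (the "aggregated estimator").
    return w @ tau, w

# Toy usage with K = 3 candidate mixture models and n = 4 observations.
tau = [[0.10, 0.80, 0.60, 0.20],
       [0.20, 0.70, 0.50, 0.30],
       [0.05, 0.90, 0.70, 0.10]]
elbo = [-120.3, -118.9, -125.0]
aggregated, weights = aggregate_posteriors(tau, elbo)
print(weights, aggregated)
```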

Related articles

Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data

We introduce Deep Variational Bayes Filters (DVBF), a new method for unsupervised learning and identification of latent Markovian state space models. Leveraging recent advances in Stochastic Gradient Variational Bayes, DVBF can overcome intractable inference distributions via variational inference. Thus, it can handle highly nonlinear input data with temporal and spatial dependencies such as im...


A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier

With the fast increase of the documents, using Text Document Classification (TDC) methods has become a crucial matter. This paper presented a hybrid model of Invasive Weed Optimization (IWO) and Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS) in order to reduce the big size of features space in TDC. TDC includes different actions such as text processing, feature extraction, form...


Nonparametric Inference for Auto-Encoding Variational Bayes

Variational approximations are an attractive approach for inference of latent variables in unsupervised learning. However, they are often computationally intractable when faced with large datasets. Recently, Variational Autoencoders (VAEs) Kingma and Welling [2014] have been proposed as a method to tackle this limitation. Their methodology is based on formulating the approximating posterior dis...


On a Dependency-based Semantic Space for Unsupervised Noun Sense Disambiguation with an Underlying Naïve Bayes Model

Recent studies refocus on usage of the Naïve Bayes model in unsupervised word sense disambiguation (WSD). They discuss the issue of feature selection for this statistical model, when used as clustering technique, and comment (Hristea, 2012) that it still holds a promise for unsupervised WSD. Within the various investigated types of feature selection, this ongoing research concentrates on syntac...


Properties of variational estimates of a mixture model for random graphs

Mixture models for random graphs have a complex dependency structure and a likelihood which is not computable even for moderate size networks. Variational and variational Bayes techniques are useful approaches for statistical inference of such complex models but their theoretical properties are not well known. We give a result about the consistency of variational estimates of the parameters of ...



Journal:
  • Computational Statistics & Data Analysis

Volume 56, Issue

Pages  -

Publication date: 2012